add CLIP-ViT-L-scope #1022
base: main
Conversation
"clip-scope": { | ||
prettyLabel: "CLIP-ViT-L-scope", | ||
repoName: "CLIP-ViT-L-scope", | ||
repoUrl: "https://github.com/Lewington-pitsos/vitsae", |
Hi @Lewington-pitsos, thanks for opening a PR :) Just to understand, what is the library here? From what I can see:

- `clip-scope` seems like a generic name that is not mentioned in either the model card or the GitHub repo
- `clipscope` seems to be a Python library ("from clipscope import ConfiguredViT, TopKSAE" in the model card), but I can't see how to install it. At least it doesn't seem to be on PyPI.
- `CLIP-ViT-L-scope` is a very specific name (the name of the repo on the Hub)
- `vitsae` is the name of the GitHub repo. Is it planned to be a published library at some point?

I'm asking because depending on what you plan to do here, the integration with the Hub could change. Typically, do you think other repos might use the same library in the future, or will it be a "one-shot"? We usually recommend having consistent naming between the library and repo tags to avoid confusion. Also, the name of the library is usually different from the name of the repo on the Hub, especially when multiple models are linked to the same library.
```ts
	repoName: "CLIP-ViT-L-scope",
	repoUrl: "https://github.com/Lewington-pitsos/vitsae",
	filter: false,
	countDownloads: `path_extension:"pt"`,
```
By doing so, any download of any `.pt` file in the repo will count as a new download. This means that if a user downloads the full repo, we will count it as X downloads (~80?). That's fine if you think users will only download files one by one depending on their use case.

Another problem is that all downloads are tracked as a single metric, so you won't be able to tell whether users are more interested in one layer than another. A solution would be to create N models on the Hub (one per layer, for instance) and then create a Collection to group them all. For example, you can find all models related to the Gemma Scope release in this collection: https://huggingface.co/collections/google/gemma-scope-release-66a4271f6f0b4d4a9d5e04e2. A benefit of having a collection of models is that downloads will be tracked for each model individually, giving you more insight into the usage of your library.

If you worry about the cost of maintaining all these repos, I'd recommend automating the upload and keeping the model cards consistent with a script based on `huggingface_hub`. Let me know what you think :)
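For illustration, here is a minimal sketch of such a script. Everything in it (repo names, local folder layout, layer count, card template) is a placeholder assumption; only the `huggingface_hub` calls themselves are the library's actual API:

```python
# Minimal sketch only. Repo ids, folder layout, layer count, and the card
# template are hypothetical placeholders; the huggingface_hub calls are real.
from huggingface_hub import HfApi, ModelCard

api = HfApi()

CARD_TEMPLATE = """---
library_name: clipscope
---

# CLIP-ViT-L-scope, layer {layer}

Placeholder description shared by every per-layer repo.
"""

for layer in range(24):  # hypothetical: one repo per transformer layer
    repo_id = f"your-username/CLIP-ViT-L-scope-layer-{layer}"  # placeholder namespace

    # Create the repo if it does not exist yet (no-op otherwise).
    api.create_repo(repo_id, repo_type="model", exist_ok=True)

    # Upload this layer's weights from a local folder.
    api.upload_folder(
        repo_id=repo_id,
        repo_type="model",
        folder_path=f"./checkpoints/layer_{layer}",  # hypothetical local path
    )

    # Generate every model card from the same template so they stay consistent.
    ModelCard(CARD_TEMPLATE.format(layer=layer)).push_to_hub(repo_id)
```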
In terms of tracking, these models are all derived from the same training process, and a common use case for them should be to compare how they perform against one another. In a sense, collectively they comprise a single tool. https://huggingface.co/google/gemma-scope-2b-pt-res, from which we took inspiration, takes a similar approach and places large numbers of related models together as a single "model".

Makes sense to keep `countDownloads: path_extension:"pt"` in that case!
Thanks for the super quick feedback. I have tried to make the names more consistent as suggested, and I have also changed the repo to point to the GitHub repo for using the files (clipscope) rather than the one I used to train the models (vitsae). The clipscope repo is much more likely to be used by others in the future than vitsae. I do seem to be able to access the clipscope PyPI package (https://pypi.org/project/clipscope/).

In terms of tracking, these models are all derived from the same training process, and a common use case for them should be to compare how they perform against one another. In a sense, collectively they comprise a single tool. https://huggingface.co/google/gemma-scope-2b-pt-res, from which we took inspiration, takes a similar approach and places large numbers of related models together as a single "model". Personally, I am more interested in the extent to which people are using the tool in any respect than in which parts of it are being used most.
Thanks for the clarification @Lewington-pitsos! Then I think it makes sense to integrate the library as mentioned below.

Also, to connect it to your model repo on the Hub, you'll have to edit the model card metadata to add `library_name: clipscope` here. Once that's done and the comments below are addressed, we should be able to merge :)
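For reference, a minimal sketch of that metadata edit, assuming the Hub repo id is `Lewington-pitsos/CLIP-ViT-L-scope` (an assumption); the same change can also be made by hand in the YAML block at the top of the model card:

```python
# Sketch: add `library_name: clipscope` to the model card metadata.
from huggingface_hub import metadata_update

metadata_update(
    repo_id="Lewington-pitsos/CLIP-ViT-L-scope",  # assumed Hub repo id
    metadata={"library_name": "clipscope"},
    overwrite=False,  # errors instead of replacing an existing, different value
)
```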
"clip-vit-l-scope": { | ||
prettyLabel: "CLIP-ViT-L-scope", | ||
repoName: "CLIP-ViT-L-scope", |
"clip-vit-l-scope": { | |
prettyLabel: "CLIP-ViT-L-scope", | |
repoName: "CLIP-ViT-L-scope", | |
"clipscope": { | |
prettyLabel: "clipscope", | |
repoName: "clipscope", |
Given the library is https://github.com/Lewington-pitsos/clipscope, I'd recommend tagging the model as "clipscope" instead.